Very fast adaptation with a compact context-dependent eigenvoice model
Abstract
The “eigenvoice” technique achieves rapid speaker adaptation by employing prior knowledge of speaker space obtained from reference speakers to place strong constraints on the initial model for each new speaker [9,10]. It has recently been shown to yield very fast adaptation for a large-vocabulary system [3] ([5] modifies the technique in an interesting way). In this paper, we describe a new way of applying the eigenvoice technique to context-dependent acoustic modeling, called the “Eigencentroid plus Delta Trees” (EDT) model. Here, the context-dependent model is defined so that it consists of a speaker-dependent component with a small number of parameters linked to a speaker-independent component with far more parameters. The eigenvoice technique can then be applied to the speaker-dependent component alone to attain very fast adaptation of the entire context-dependent model (e.g., 10% relative reduction in error rate after 3 sentences). EDT requires only a small number of parameters to represent speaker space and works even if only a small amount of data is available per reference speaker (in contrast to the system described in [3]).
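The split between a small speaker-dependent component and a large speaker-independent component can be illustrated with a toy sketch. It rests on assumptions the abstract does not confirm: each context-dependent mean is taken to be a speaker centroid plus a speaker-independent delta, the centroid is constrained to a space spanned by orthonormal "eigencentroid" vectors, and the weights are found by a plain least-squares projection rather than by whatever estimator the paper actually uses. All function names, context labels, and numbers are hypothetical.

```python
# Toy sketch only: names, labels, and values are illustrative assumptions,
# not the paper's actual EDT implementation.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def estimate_weights(frames, deltas, mean_centroid, eigencentroids):
    """Least-squares weights, assuming an orthonormal eigencentroid basis.

    frames: list of (context_label, feature_vector) pairs from the new speaker.
    """
    dim = len(mean_centroid)
    # Subtract the speaker-independent delta and the mean centroid from each
    # frame, then average the residuals over all adaptation frames.
    residual = [0.0] * dim
    for context, x in frames:
        for i in range(dim):
            residual[i] += x[i] - deltas[context][i] - mean_centroid[i]
    residual = [r / len(frames) for r in residual]
    # With an orthonormal basis, projection reduces to dot products.
    return [dot(e, residual) for e in eigencentroids]

def adapted_mean(context, weights, deltas, mean_centroid, eigencentroids):
    """Speaker centroid (mean + weighted basis) plus the context's delta."""
    centroid = list(mean_centroid)
    for w, e in zip(weights, eigencentroids):
        for i in range(len(centroid)):
            centroid[i] += w * e[i]
    return [c + d for c, d in zip(centroid, deltas[context])]

# Hypothetical 2-dimensional example with two contexts.
mean_centroid = [0.0, 0.0]
eigencentroids = [[1.0, 0.0], [0.0, 1.0]]
deltas = {"a-b+c": [0.5, -0.5], "x-b+y": [-0.25, 0.25]}

# Adaptation data covers only one context.
frames = [("a-b+c", [2.5, 0.5]), ("a-b+c", [2.5, 0.5])]
weights = estimate_weights(frames, deltas, mean_centroid, eigencentroids)

# The estimated centroid also updates a context never seen in adaptation.
unseen = adapted_mean("x-b+y", weights, deltas, mean_centroid, eigencentroids)
```

In this sketch only the two weights are estimated from the new speaker's data, yet the mean for the unseen context `x-b+y` moves as well. That is the sense in which adapting the small speaker-dependent component adapts the whole context-dependent model.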
Similar resources
Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognition
In this paper, we present a maximum a posteriori (MAP) eigenvoice speaker adaptation approach for a self-adaptation system. The proposed MAP eigenvoice method is developed by introducing a probability density model for the eigenvoice coefficients. We also build a self-adaptation system that is useful for public users, because the user does not need to speak several sentences for adaptation. In self-adaptati...
Fast Speaker Adaptation Using A Priori Knowledge
Recently, we presented a radically new class of fast adaptation techniques for speech recognition, based on prior knowledge of speaker variation. To obtain this prior knowledge, one applies a dimensionality reduction technique to T vectors of dimension D derived from T speaker-dependent (SD) models. This offline step yields T basis vectors, the eigenvoices. We constrain the model for new speake...
Eigenvoices for speaker adaptation
We have devised a new class of fast adaptation techniques for speech recognition, based on prior knowledge of speaker variation. To obtain this prior knowledge, one applies Principal Component Analysis (PCA) [9] or a similar technique to a training set of T vectors of dimension D derived from T speaker-dependent (SD) models. This offline step yields T basis vectors, which we call “eigenvoices” ...
Eigenvoices for HMM-based
This paper describes an eigenvoice technique for an HMM-based speech synthesis system that can synthesize speech with various voice qualities. In the eigenvoice technique, which has been applied successfully to fast speaker adaptation in HMM-based speech recognition, a large number of speaker-dependent HMM sets are represented by a few parameters through a dimensionality reduction technique,...